SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks
Improving the efficiency of state-of-the-art methods in semantic segmentation requires overcoming the increasing computational cost as well as issues such as fusing semantic information from global and local contexts. Motivated by the recent successes of convolutional neural networks (CNNs) in semantic segmentation and the problems they encounter, this research proposes an encoder-decoder architecture with a unique efficient residual network. Attention-boosting gates (AbGs) and attention-boosting modules (AbMs) are deployed to fuse the feature-based semantic information with the global context of the efficient residual network in the encoder. The decoder network, in turn, is developed with additional attention-fusion networks (AfNs) inspired by AbM. AfNs are designed to improve the efficiency of the one-to-one conversion of the semantic information by deploying additional convolution layers in the decoder part. Our network is tested on the challenging CamVid and Cityscapes datasets, and the proposed methods show significant improvements over existing baselines, such as ResNet-50. To the best of our knowledge, the developed network, SERNet-Former, achieves state-of-the-art results (84.62% mean IoU) on the CamVid dataset and challenging results (87.35% mean IoU) on the Cityscapes validation dataset.
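The results above are reported as mean IoU. As a minimal sketch of how that metric is commonly computed (not the authors' evaluation code), the snippet below averages the per-class intersection-over-union over flat lists of predicted and ground-truth labels. The function name `mean_iou` and the choice to skip classes absent from both lists are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for flat label lists."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes absent from both lists
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with six pixels and three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mean IoU = {score:.2f}")
```

Benchmark toolkits typically accumulate a confusion matrix over the whole dataset before dividing, which can differ from averaging per-image scores.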
Evolutionary Neural Architecture Search for Transformer in Knowledge Tracing
Yang, Shangshang, Yu, Xiaoshan, Tian, Ye, Yan, Xueming, Ma, Haiping, Zhang, Xingyi
Knowledge tracing (KT) aims to trace students' knowledge states by predicting whether students answer exercises correctly. Despite their excellent performance, existing Transformer-based KT approaches are criticized for relying on manually selected input features for fusion and for using a single global context model, which fails to directly capture students' forgetting behavior when the related records are distant in time from the current record. To address these issues, this paper first adds convolution operations to the Transformer to enhance the local context modelling needed to capture students' forgetting behavior, and then proposes an evolutionary neural architecture search approach that automates input feature selection and automatically determines where to apply which operation so as to balance local and global context modelling. In the search space, the original global path containing the attention module in the Transformer is replaced with the sum of a global path and a local path that could contain different convolutions, and the selection of input features is also considered. To search for the best architecture, we employ an effective evolutionary algorithm to explore the search space and also suggest a search space reduction strategy to accelerate the convergence of the algorithm. Experimental results on the two largest and most challenging education datasets demonstrate the effectiveness of the architecture found by the proposed approach.
LRDB: LSTM Raw data DNA Base-caller based on long-short term models in an active learning environment
Rezaei, Ahmad, Taheri, Mahdi, Mahani, Ali, Magierowski, Sebastian
The first important step in extracting DNA characters is using the output data of MinION devices in the form of electrical current signals. Various cutting-edge base callers use this data to detect the DNA characters based on the input. In this paper, we discuss several shortcomings of prior base callers in the case of time-critical applications, privacy-aware design, and the problem of catastrophic forgetting. Next, we propose the LRDB model, a lightweight open-source model for private developments with a better read-identity (0.35% increase) for the target bacterial samples in the paper. We have limited the extent of training data and used transfer learning to make the active usage of the LRDB viable in critical applications. Hence, less training time is needed for adapting to new DNA samples (in our case, bacterial samples). Furthermore, LRDB can be modified according to the user's constraints, as the results show a negligible accuracy loss when fewer parameters are used. We have also assessed the noise-tolerance property, which shows about a 1.439% decline in accuracy for a 15 dB noise injection, and the performance metrics show that the model executes in a medium speed range compared with current cutting-edge models.
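The abstract does not say how the 15 dB noise injection is defined. A minimal sketch, assuming it means additive white Gaussian noise at a signal-to-noise ratio of 15 dB, is shown below. The function name `add_noise`, its parameters, and the sine-wave test signal are hypothetical and not taken from the LRDB code.

```python
import math
import random


def add_noise(signal, snr_db, seed=0):
    """Add white Gaussian noise so that the result has roughly the given SNR in dB."""
    rng = random.Random(seed)
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Hypothetical stand-in for a raw current trace.
clean = [math.sin(0.01 * i) for i in range(20000)]
noisy = add_noise(clean, snr_db=15.0)
```

A noise-tolerance check would then run the base caller on `noisy` and compare its accuracy with the accuracy on `clean`.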
Satellite imagery segmentation using U-NET
In this blog, we will perform image segmentation on a very limited dataset using U-Net, a popular segmentation CNN model. We will also use some custom loss functions and metrics for training, such as Dice loss and the Jaccard index. The data that we will be working with comes from Kaggle. The dataset is called Semantic segmentation of aerial imagery. The dataset has two types of files .jpg
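To make the two quantities concrete, here is a minimal plain-Python sketch of Dice loss and the Jaccard index on flattened masks. It is not the blog's training code: the function names, the smoothing term `eps`, and the flat-list input format are assumptions for illustration, and a real training loop would use the tensor operations of its deep learning framework instead.

```python
def dice_loss(pred, target, eps=1e-6):
    """1 - Dice coefficient for flat lists of probabilities or 0/1 labels."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps avoids division by zero when both masks are empty.
    return 1.0 - (2.0 * inter + eps) / (total + eps)


def jaccard_index(pred, target, eps=1e-6):
    """Intersection over union for flat lists of probabilities or 0/1 labels."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - inter
    return (inter + eps) / (union + eps)


# Toy 4-pixel mask: one of the two predicted foreground pixels is correct.
pred = [1.0, 1.0, 0.0, 0.0]
target = [1.0, 0.0, 0.0, 0.0]
print(dice_loss(pred, target), jaccard_index(pred, target))
```

Both functions accept soft predictions in [0, 1], which is what makes Dice usable as a differentiable loss rather than only as an evaluation metric.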
Memory Association Networks
Kim, Seokjun, Jang, Jaeeun, Jang, Yeonju, Choi, Seongyune, Kim, Hyeoncheol
Various networks have been designed in the deep learning field to date. Typically, images, sounds, text, hierarchical, and relational data are learned through the networks, and inductive learning is performed. But these networks are limited to specific datasets or specific tasks. Therefore, we designed artificial association networks that, like humans, can simultaneously learn various datasets in one network. In a second study, deductive association networks were proposed to perform deductive reasoning.
# 020 Overview of Semantic Segmentation methods - Master Data Science 08.11.2021
In this post, we will see how we can use Neural Networks for the segmentation task. To be more precise, it will be about Semantic Segmentation. The goal of Semantic Segmentation is to label each pixel of an image with a corresponding class. When we start learning Deep Learning, our first experiments usually involve classification problems, where we need to determine the class label of the object in the image.
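The difference from classification can be sketched in a few lines: instead of one label per image, a segmentation model produces a score per class at every pixel, and the label map takes the highest-scoring class at each position. The function name `label_pixels` and the nested-list layout `scores[row][col][class]` are assumptions for this illustration.

```python
def label_pixels(scores):
    """Return the index of the highest-scoring class for every pixel."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# Toy 2x2 "image" with three class scores per pixel.
scores = [
    [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]],
    [[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]],
]
print(label_pixels(scores))
```

The output has the same height and width as the input, with one class index per pixel.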
Text Classification using Transformers
In this part, we will try to understand the Encoder-Decoder architecture of the Multi-Head Self-Attention Transformer network with some code in PyTorch. There won't be any theory involved (a better theoretical version can be found here), just the bare bones of the network and how one can write it on one's own in PyTorch. The Transformer architecture is divided into two parts: the Encoder and the Decoder. Several components combine to form the Encoder and Decoder parts. Let's start with the Encoder.
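Before the Encoder itself, it may help to see the operation at its core. The sketch below implements single-head scaled dot-product self-attention in plain Python rather than PyTorch, so that it runs without dependencies. It is not the post's code: a multi-head layer would run several such heads with separate projections and concatenate their outputs. The helper names and the weight matrices `wq`, `wk`, `wv` are assumptions for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a sequence x."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)  # scores[i][j] = q_i . k_j
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Three tokens with two features each; identity projections keep the toy readable.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
out, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1 and says how much every token contributes to the output for that position.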
Establishing strong imputation performance of a denoising autoencoder in a wide range of missing data problems
Abiri, Najmeh, Linse, Björn, Edén, Patrik, Ohlsson, Mattias
Dealing with missing data in data analysis is inevitable. Although powerful imputation methods that address this problem exist, there is still much room for improvement. In this study, we examined single imputation based on deep autoencoders, motivated by the apparent success of deep learning in efficiently extracting useful dataset features. We have developed a consistent framework for both training and imputation. Moreover, we benchmarked the results against state-of-the-art imputation methods on different data sizes and characteristics. The work was not limited to datasets with a single variable type; we also imputed missing data with multi-type variables, e.g., a combination of binary, categorical, and continuous attributes. To evaluate the imputation methods, we randomly corrupted the complete data, with varying degrees of corruption, and then compared the imputed and original values. In all experiments, the developed autoencoder obtained the smallest error for all ranges of initial data corruption.
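The evaluation protocol described above (corrupt complete data at random, impute, compare with the original values) can be sketched as follows. This is an illustration under stated assumptions, not the authors' framework: column-mean imputation stands in for the denoising autoencoder, entries are removed independently at a fixed rate, and the error is an RMSE over the corrupted entries only. All function names are hypothetical.

```python
import math
import random


def corrupt(data, rate, seed=0):
    """Randomly replace entries with None; return the corrupted data and the mask."""
    rng = random.Random(seed)
    mask = [[rng.random() < rate for _ in row] for row in data]
    corrupted = [
        [None if m else v for v, m in zip(row, mrow)]
        for row, mrow in zip(data, mask)
    ]
    return corrupted, mask


def mean_impute(corrupted):
    """Placeholder imputer: fill each missing entry with its column mean."""
    means = []
    for col in zip(*corrupted):
        observed = [v for v in col if v is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in corrupted]


def masked_rmse(original, imputed, mask):
    """Root-mean-square error computed only over the corrupted entries."""
    errs = [
        (o - i) ** 2
        for orow, irow, mrow in zip(original, imputed, mask)
        for o, i, m in zip(orow, irow, mrow)
        if m
    ]
    return math.sqrt(sum(errs) / len(errs)) if errs else 0.0


# Toy complete dataset with continuous attributes only.
data = [[float(i + j) for j in range(3)] for i in range(10)]
for rate in (0.1, 0.3, 0.5):  # varying degrees of corruption
    corrupted, mask = corrupt(data, rate)
    imputed = mean_impute(corrupted)
    print(rate, masked_rmse(data, imputed, mask))
```

Swapping `mean_impute` for any other imputer with the same input and output format leaves the rest of the loop unchanged, which is what makes this kind of benchmark comparable across methods.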